AAAI.2019 - Reasoning under Uncertainty

Total: 31

#1 Active Preference Learning Based on Generalized Gini Functions: Application to the Multiagent Knapsack Problem

Authors: Nadjet Bourdache ; Patrice Perny

We consider the problem of actively eliciting preferences from a Decision Maker supervising a collective decision process in the context of fair multiagent combinatorial optimization. Individual preferences are assumed to be known and represented by linear utility functions defined on a combinatorial domain, and the social utility is defined as a generalized Gini social evaluation function (GSF) for the sake of fairness. The GSF is a non-linear aggregation function parameterized by weighting coefficients that allow fine control of the equity requirement in the aggregation of individual utilities. The paper focuses on the elicitation of these weights by active learning in the context of the fair multiagent knapsack problem. We introduce and compare several incremental decision procedures that interleave an adaptive preference elicitation procedure with a combinatorial optimization algorithm to determine a GSF-optimal solution. We establish an upper bound on the number of queries and provide numerical tests to show the efficiency of the proposed approach.
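
As a concrete illustration of the aggregation being elicited, the following minimal sketch evaluates a GSF as an ordered weighted average with decreasing weights; the utility vector and weights are hypothetical, and the elicitation procedure itself is not shown.

```python
import numpy as np

def gsf(utilities, weights):
    """Generalized Gini social evaluation function (an OWA with decreasing
    weights): utilities are sorted in ascending order, so the largest
    weights apply to the worst-off agents, encoding a preference for equity."""
    u = np.sort(np.asarray(utilities, dtype=float))  # ascending: worst-off first
    w = np.asarray(weights, dtype=float)             # assumed decreasing, summing to 1
    return float(np.dot(w, u))

# Hypothetical example: 3 agents, weights favoring the worst-off agent.
w = [0.5, 0.3, 0.2]
print(gsf([10, 4, 7], w))   # 0.5*4 + 0.3*7 + 0.2*10 = 6.1
```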

#2 Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications

Authors: Daniel S. Brown ; Scott Niekum

Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for policy improvement and generalization. However, despite much recent interest in IRL, little work has been done to understand the minimum set of demonstrations needed to teach a specific sequential decision-making task. We formalize the problem of finding maximally informative demonstrations for IRL as a machine teaching problem where the goal is to find the minimum number of demonstrations needed to specify the reward equivalence class of the demonstrator. We extend previous work on algorithmic teaching for sequential decision-making tasks by showing a reduction to the set cover problem, which enables an efficient approximation algorithm for determining the set of maximally informative demonstrations. We apply our proposed machine teaching algorithm to two novel applications: providing a lower bound on the number of queries needed to learn a policy using active IRL and developing a novel IRL algorithm that can learn more efficiently from informative demonstrations than a standard IRL approach.
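
The set-cover reduction admits the classic greedy approximation; the sketch below illustrates that step under the assumption that each candidate demonstration comes annotated with the set of reward constraints it induces. All names and the toy instance are hypothetical.

```python
def greedy_demo_selection(demo_constraints, universe):
    """Greedy set-cover approximation: repeatedly pick the demonstration
    whose induced reward constraints cover the most still-uncovered
    constraints of the target reward equivalence class.

    demo_constraints: dict mapping demo id -> set of constraint ids it induces
    universe: set of constraint ids defining the equivalence class
    """
    uncovered, chosen = set(universe), []
    while uncovered:
        best = max(demo_constraints,
                   key=lambda d: len(demo_constraints[d] & uncovered))
        gained = demo_constraints[best] & uncovered
        if not gained:            # remaining constraints cannot be covered
            break
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical toy instance: 4 demos covering 5 constraints.
demos = {"d1": {1, 2}, "d2": {2, 3, 4}, "d3": {4, 5}, "d4": {1}}
print(greedy_demo_selection(demos, {1, 2, 3, 4, 5}))  # ['d2', 'd1', 'd3']
```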

#3 Robustness Guarantees for Bayesian Inference with Gaussian Processes

Authors: Luca Cardelli ; Marta Kwiatkowska ; Luca Laurenti ; Andrea Patane

Bayesian inference and Gaussian processes are widely used in applications ranging from robotics and control to biological systems. Many of these applications are safety-critical and require a characterization of the uncertainty associated with the learning model and formal guarantees on its predictions. In this paper we define a robustness measure for Bayesian inference against input perturbations, given by the probability that, for a test point and a compact set in the input space containing the test point, the prediction of the learning model will remain δ-close for all the points in the set, for δ > 0. Such measures can be used to provide formal probabilistic guarantees for the absence of adversarial examples. By employing the theory of Gaussian processes, we derive upper bounds on the resulting robustness by utilising the Borell-TIS inequality, and propose algorithms for their computation. We evaluate our techniques on two examples, a GP regression problem and a fully-connected deep neural network, where we rely on weak convergence to GPs to study adversarial examples on the MNIST dataset.
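
For reference, the Borell-TIS inequality used to derive these upper bounds states, for a centered Gaussian process with almost-surely bounded sample paths:

```latex
% Borell-TIS inequality for a centered Gaussian process (f(t))_{t \in T}
% with almost-surely bounded paths and \sigma_T^2 = \sup_{t \in T} \mathbb{E}[f(t)^2]:
\Pr\Big(\sup_{t \in T} f(t) - \mathbb{E}\Big[\sup_{t \in T} f(t)\Big] \ge u\Big)
  \le \exp\!\Big(-\frac{u^2}{2\sigma_T^2}\Big), \qquad u > 0.
```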

#4 Probabilistic Logic Programming with Beta-Distributed Random Variables

Authors: Federico Cerutti ; Lance Kaplan ; Angelika Kimmig ; Murat Şensoy

We enable aProbLog—a probabilistic logical programming approach—to reason in the presence of uncertain probabilities represented as Beta-distributed random variables. We achieve the same performance as state-of-the-art algorithms for highly specified and engineered domains, while maintaining the flexibility offered by aProbLog in handling complex relational domains. Our motivation is that faithfully capturing the distribution of probabilities is necessary to compute an expected utility for effective decision making under uncertainty: unfortunately, these probability distributions can be highly uncertain due to sparse data. To understand and accurately manipulate such probability distributions we need a well-defined theoretical framework that is provided by the Beta distribution, which specifies a distribution of probabilities representing all the possible values of a probability when the exact value is unknown.
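
To illustrate why the Beta family is convenient here, the sketch below propagates two independent Beta-distributed probabilities through a conjunction and moment-matches the product back onto a Beta; this is a generic illustrative calculation, not aProbLog's exact inference semantics.

```python
from scipy.stats import beta

def beta_product(a1, b1, a2, b2):
    """Moment-match the product X*Y of two independent Beta variables
    (the probability of a conjunction of independent uncertain events)
    back onto a single Beta distribution."""
    m1, m2 = a1 / (a1 + b1), a2 / (a2 + b2)
    # second moments: E[X^2] = a(a+1) / ((a+b)(a+b+1))
    s1 = a1 * (a1 + 1) / ((a1 + b1) * (a1 + b1 + 1))
    s2 = a2 * (a2 + 1) / ((a2 + b2) * (a2 + b2 + 1))
    m, v = m1 * m2, s1 * s2 - (m1 * m2) ** 2
    common = m * (1 - m) / v - 1          # matches mean and variance
    return m * common, (1 - m) * common   # (alpha, beta) of the fitted Beta

a, b = beta_product(2, 8, 5, 5)           # sparse evidence: wide posteriors
print(beta.interval(0.95, a, b))          # 95% credible interval of the product
```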

#5 On Testing of Uniform Samplers

Authors: Sourav Chakraborty ; Kuldeep S. Meel

Recent years have seen an unprecedented adoption of artificial intelligence in a wide variety of applications, ranging from medical diagnosis and the automobile industry to security and aircraft collision avoidance. Probabilistic reasoning is a key component of such modern artificial intelligence systems, and sampling techniques form the core of state-of-the-art probabilistic reasoning systems. The divide between sampling techniques that have strong theoretical guarantees but fail to scale and scalable techniques with weak or no theoretical guarantees mirrors the gap in software engineering between the poor scalability of classical program synthesis techniques and the billions of programs that are routinely used by practitioners. One bridge connecting the two extremes in the context of software engineering has been program testing. In contrast to testing of deterministic programs, where one trace is sufficient to prove the existence of a bug, in the case of samplers one sample is typically not sufficient to prove non-conformity of the sampler to the desired distribution. This makes one wonder whether it is possible to design a testing methodology to check whether a sampler under test generates samples close to a given distribution. The primary contribution of this paper is an affirmative answer to the above question when the given distribution is a uniform distribution: we design, to the best of our knowledge, the first algorithmic framework, Barbarik, to test whether the distribution generated is ε-close or η-far from the uniform distribution. In contrast to sampling techniques that require an exponential or sub-exponential number of samples for a sampler whose support can be represented by n bits, Barbarik requires only O(1/(η−ε)⁴) samples. We present a prototype implementation of Barbarik and use it to test three state-of-the-art uniform samplers over supports defined by combinatorial constraints. Barbarik can provide a certificate of uniformity for one sampler and demonstrate non-uniformity for the other two. Erratum: This research is supported in part by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: [AISG-RP-2018-005])
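
Barbarik itself relies on a conditional-sampling trick and is not reproduced here, but the statistical idea of distinguishing a uniform sampler from a skewed one can be seen in a naive collision-rate heuristic (a far weaker test than Barbarik, shown only for intuition):

```python
import random
from collections import Counter

def collision_rate(samples):
    """Fraction of sample pairs that collide; for a truly uniform
    distribution over a support of size S this concentrates near 1/S,
    while skewed distributions push it strictly higher."""
    n = len(samples)
    counts = Counter(samples)
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    return collisions / (n * (n - 1) / 2)

support_size = 100
uniform = [random.randrange(support_size) for _ in range(2000)]
skewed = [0 if random.random() < 0.2 else random.randrange(support_size)
          for _ in range(2000)]
print(collision_rate(uniform), 1 / support_size)  # close to 0.01
print(collision_rate(skewed))                     # noticeably larger (~0.05)
```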

#6 On the Hardness of Probabilistic Inference Relaxations

Authors: Supratik Chakraborty ; Kuldeep S. Meel ; Moshe Y. Vardi

A promising approach to probabilistic inference that has attracted recent attention exploits its reduction to a set of model counting queries. Since probabilistic inference and model counting are #P-hard, various relaxations are used in practice, with the hope that these relaxations allow efficient computation while also providing rigorous approximation guarantees. In this paper, we show that contrary to common belief, several relaxations used for model counting and its applications (including probabilistic inference) do not really lead to computational efficiency in a complexity theoretic sense. Our arguments proceed by showing the corresponding relaxed notions of counting to be computationally hard. We argue that approximate counting with multiplicative tolerance and probabilistic guarantees of correctness is the only class of relaxations that provably simplifies the problem, given access to an NP-oracle. Finally, we show that for applications that compare probability estimates with a threshold, a new notion of relaxation with gaps between low and high thresholds can be used. This new relaxation allows efficient decision making in practice, given access to an NP-oracle, while also bounding the approximation error. Erratum: This research is supported in part by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: [AISG-RP-2018-005])

#7 Learning Diverse Bayesian Networks

Authors: Cong Chen ; Changhe Yuan

Much effort has been directed at developing algorithms for learning optimal Bayesian network structures from data. When given limited or noisy data, however, the optimal Bayesian network often fails to capture the true underlying network structure. One can potentially address the problem by finding multiple most likely Bayesian networks (K-Best) in the hope that one of them recovers the true model. However, it is often the case that some of the best models come from the same peak(s) and are very similar to each other, so they tend to fail together. Moreover, many of these models are not even optimal with respect to any causal ordering, and thus unlikely to be useful. This paper proposes a novel method for finding a set of diverse top Bayesian networks, called modes, such that each network is guaranteed to be optimal in a local neighborhood. Such mode networks are expected to provide a much better coverage of the true model. Based on a global-local theorem showing that a mode Bayesian network must be optimal in all local scopes, we introduce an A* search algorithm to efficiently find the top M Bayesian networks which are highly probable and naturally diverse. Empirical evaluations show that our top mode models have much better diversity as well as accuracy in discovering true underlying models than those found by K-Best.

#8 Path-Specific Counterfactual Fairness

Author: Silvia Chiappa

We consider the problem of learning fair decision systems from data in which a sensitive attribute might affect the decision along both fair and unfair pathways. We introduce a counterfactual approach to disregard effects along unfair pathways that does not incur the same loss of individual-specific information as previous approaches. Our method corrects observations adversely affected by the sensitive attribute, and uses these to form a decision. We leverage recent developments in deep learning and approximate inference to develop a VAE-type method that is widely applicable to complex nonlinear models.

#9 Efficient Optimal Approximation of Discrete Random Variables for Estimation of Probabilities of Missing Deadlines

Authors: Liat Cohen ; Gera Weiss

We present an efficient algorithm that, given a discrete random variable X and a number m, computes a random variable whose support is of size at most m and whose Kolmogorov distance from X is minimal. We present some variants of the algorithm, analyse their correctness and computational complexity, and present a detailed empirical evaluation that shows how they perform in practice. The main application that we examine, which is our motivation for this work, is estimation of the probability of missing deadlines in series-parallel schedules. Since exact computation of these probabilities is NP-hard, we propose to use the algorithms described in this paper to obtain an approximation.
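
To see where the hardness comes from, the sketch below composes discrete completion-time distributions exactly: series composition is a convolution and parallel composition is a maximum, so support sizes multiply with every series step, which is precisely what approximating each variable on at most m points is meant to tame. The toy schedule is hypothetical.

```python
from collections import defaultdict

def series(X, Y):
    """Completion time of two tasks in sequence: Z = X + Y (convolution).
    RVs are dicts mapping value -> probability; independence assumed."""
    Z = defaultdict(float)
    for x, px in X.items():
        for y, py in Y.items():
            Z[x + y] += px * py
    return dict(Z)

def parallel(X, Y):
    """Completion time of two tasks in parallel: Z = max(X, Y)."""
    Z = defaultdict(float)
    for x, px in X.items():
        for y, py in Y.items():
            Z[max(x, y)] += px * py
    return dict(Z)

def p_miss(Z, deadline):
    """Probability that the schedule finishes after the deadline."""
    return sum(p for t, p in Z.items() if t > deadline)

# Illustrative two-task schedule; note the support of `series` grows
# multiplicatively, motivating the m-point approximation.
X = {1: 0.5, 3: 0.5}
Y = {2: 0.7, 5: 0.3}
print(p_miss(series(X, Y), deadline=6))   # P(X + Y > 6) = 0.5 * 0.3 = 0.15
```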

#10 Fast Relational Probabilistic Inference and Learning: Approximate Counting via Hypergraphs

Authors: Mayukh Das ; Devendra Singh Dhami ; Gautam Kunapuli ; Kristian Kersting ; Sriraam Natarajan

Counting the number of true instances of a clause is arguably a major bottleneck in relational probabilistic inference and learning. We approximate counts in two steps: (1) transform the fully grounded relational model to a large hypergraph, and partially-instantiated clauses to hypergraph motifs; (2) since the expected counts of the motifs are provably the clause counts, approximate them using summary statistics (in/out-degrees, edge counts, etc.). Our experimental results demonstrate the efficiency of these approximations, which can be applied to many complex statistical relational models, and can be significantly faster than the state of the art, both for inference and learning, without sacrificing effectiveness.

#11 Exact and Approximate Weighted Model Integration with Probability Density Functions Using Knowledge Compilation

Authors: Pedro Zuidberg Dos Martires ; Anton Dries ; Luc De Raedt

Weighted model counting has recently been extended to weighted model integration, which can be used to solve hybrid probabilistic reasoning problems. Such problems involve both discrete and continuous probability distributions. We show how standard knowledge compilation techniques (to SDDs and d-DNNFs) apply to weighted model integration, and use them in two novel solvers, one exact and one approximate. Furthermore, we extend the class of employable weight functions to actual probability density functions instead of mere polynomial weight functions.

#12 Marginal Inference in Continuous Markov Random Fields Using Mixtures

Authors: Yuanzhen Guo ; Hao Xiong ; Nicholas Ruozzi

Exact marginal inference in continuous graphical models is computationally challenging outside of a few special cases. Existing work on approximate inference has focused on approximately computing the messages as part of the loopy belief propagation algorithm either via sampling methods or moment matching relaxations. In this work, we present an alternative family of approximations that, instead of approximating the messages, approximates the beliefs in the continuous Bethe free energy using mixture distributions. We show that these types of approximations can be combined with numerical quadrature to yield algorithms with both theoretical guarantees on the quality of the approximation and significantly better practical performance in a variety of applications that are challenging for current state-of-the-art methods.
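
The quadrature ingredient can be illustrated in isolation: a Gauss-Hermite rule computes expectations under a Gaussian belief from a handful of evaluation points. This standalone sketch is not the paper's Bethe-mixture algorithm.

```python
import numpy as np

def gaussian_expectation(g, mu, sigma, degree=10):
    """E[g(X)] for X ~ N(mu, sigma^2) via Gauss-Hermite quadrature:
    with nodes x_i and weights w_i for weight function exp(-x^2),
    E[g(X)] ≈ (1/sqrt(pi)) * sum_i w_i * g(mu + sqrt(2)*sigma*x_i)."""
    nodes, weights = np.polynomial.hermite.hermgauss(degree)
    return weights @ g(mu + np.sqrt(2.0) * sigma * nodes) / np.sqrt(np.pi)

# Sanity check: E[X^2] for N(1, 2^2) is mu^2 + sigma^2 = 5.
print(gaussian_expectation(lambda x: x**2, mu=1.0, sigma=2.0))  # ~5.0
```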

#13 A Generative Model for Dynamic Networks with Applications

Authors: Shubham Gupta ; Gaurav Sharma ; Ambedkar Dukkipati

Networks observed in the real world, such as social networks and collaboration networks, exhibit temporal dynamics, i.e., nodes and edges appear and/or disappear over time. In this paper, we propose a generative, latent space based, statistical model for such networks (called dynamic networks). We consider the case where the number of nodes is fixed, but the presence of edges can vary over time. Our model allows the number of communities in the network to be different at different time steps. We use a neural network based methodology to perform approximate inference in the proposed model and its simplified version. Experiments done on synthetic and real world networks for the tasks of community detection and link prediction demonstrate the utility and effectiveness of our model as compared to other similar existing approaches.

#14 Collective Online Learning of Gaussian Processes in Massive Multi-Agent Systems

Authors: Trong Nghia Hoang ; Quang Minh Hoang ; Kian Hsiang Low ; Jonathan How

This paper presents a novel Collective Online Learning of Gaussian Processes (COOL-GP) framework for enabling a massive number of GP inference agents to simultaneously perform (a) efficient online updates of their GP models using their local streaming data with varying correlation structures and (b) decentralized fusion of their resulting online GP models with different learned hyperparameter settings and inducing inputs. To realize this, we exploit the notion of a common encoding structure to encapsulate the local streaming data gathered by any GP inference agent into summary statistics based on our proposed representation. This representation is amenable to both an efficient online update via an importance sampling trick and multi-agent model fusion via decentralized message passing, which can exploit sparse connectivity among agents to improve efficiency and enhance the robustness of our framework against transmission loss. We provide a rigorous theoretical analysis of the approximation loss arising from our proposed representation to achieve efficient online updates and model fusion. Empirical evaluations show that COOL-GP is highly effective in model fusion, resilient to information disparity between agents, robust to transmission loss, and can scale to thousands of agents.

#15 MFBO-SSM: Multi-Fidelity Bayesian Optimization for Fast Inference in State-Space Models

Authors: Mahdi Imani ; Seyede Fatemeh Ghoreishi ; Douglas Allaire ; Ulisses M. Braga-Neto

Nonlinear state-space models are ubiquitous in modeling real-world dynamical systems. Sequential Monte Carlo (SMC) techniques, also known as particle methods, are a well-known class of parameter estimation methods for this general class of state-space models. Existing SMC-based techniques rely on excessive sampling of the parameter space, which makes their computation intractable for large systems or tall data sets. Bayesian optimization techniques have been used for fast inference in state-space models with intractable likelihoods. These techniques aim to find the maximum of the likelihood function by sequential sampling of the parameter space through a single SMC approximator. Various SMC approximators with different fidelities and computational costs are often available for sample-based likelihood approximation. In this paper, we propose a multi-fidelity Bayesian optimization algorithm for the inference of general nonlinear state-space models (MFBO-SSM), which enables simultaneous sequential selection of parameters and approximators. The accuracy and speed of the algorithm are demonstrated by numerical experiments using synthetic gene expression data from a gene regulatory network model and real data from the VIX stock price index.

#16 Polynomial-Time Probabilistic Reasoning with Partial Observations via Implicit Learning in Probability Logics

Author: Brendan Juba

Standard approaches to probabilistic reasoning require that one possesses an explicit model of the distribution in question. But the empirical learning of models of probability distributions from partial observations is a problem for which efficient algorithms are generally not known. In this work we consider the use of bounded-degree fragments of the “sum-of-squares” logic as a probability logic. Prior work has shown that we can decide refutability for such fragments in polynomial time. We propose to use such fragments to decide queries about whether a given probability distribution satisfies a given system of constraints and bounds on expected values. We show that in answering such queries, such constraints and bounds can be implicitly learned from partial observations in polynomial time as well. It is known that this logic is capable of deriving many bounds that are useful in probabilistic analysis. We show here that it furthermore captures key polynomial-time fragments of resolution. Thus, these fragments are also quite expressive.

#17 Randomized Strategies for Robust Combinatorial Optimization

Authors: Yasushi Kawase ; Hanna Sumita

In this paper, we study the following robust optimization problem. Given an independence system and candidate objective functions, we choose an independent set, and then an adversary chooses one objective function, knowing our choice. The goal is to find a randomized strategy (i.e., a probability distribution over the independent sets) that maximizes the expected objective value in the worst case. This problem is fundamental in areas such as artificial intelligence, machine learning, game theory and optimization. To solve the problem, we propose two types of schemes for designing approximation algorithms. One scheme is for the case when objective functions are linear. It first finds an approximately optimal aggregated strategy and then retrieves a desired solution with little loss of the objective value. The approximation ratio depends on a relaxation of an independence system polytope. As applications, we provide approximation algorithms for a knapsack constraint and for matroid intersection by developing appropriate relaxations and retrievals. The other scheme is based on the multiplicative weights update (MWU) method. A direct application of the MWU method does not yield a strict multiplicative approximation algorithm, but yields one with an additional additive error term. A key technique to overcome this issue is to introduce a new concept called (η,γ)-reductions for objective functions with parameters η and γ. We show that our scheme outputs a nearly α-approximate solution if there exists an α-approximation algorithm for a subproblem defined by (η,γ)-reductions. This improves approximation ratios in previous results. Using our result, we provide approximation algorithms when the objective functions are submodular or correspond to the cardinality robustness for the knapsack problem.
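
A bare-bones version of the MWU scheme (without the (η,γ)-reduction refinement) maintains weights over the adversary's objectives and repeatedly best-responds; `best_response` is an assumed oracle for the underlying combinatorial subproblem, and objective values are assumed to lie in [0, 1].

```python
import math

def mwu_robust(objectives, best_response, rounds=200, eta=0.1):
    """Multiplicative weights over the adversary's k objective functions.
    Each round: the adversary mixes objectives by weight, we best-respond
    with an (approximate) oracle, and objectives on which we scored well
    are down-weighted, so the adversary concentrates on our weak spots.
    The uniform mixture of our responses approximates a max-min strategy.

    objectives: list of functions f_j(S) -> value in [0, 1] (assumed).
    best_response(mix): assumed oracle returning an independent set
    (approximately) maximizing sum_j mix[j] * objectives[j](S).
    """
    k = len(objectives)
    w = [1.0] * k
    strategy = []                      # support of the randomized strategy
    for _ in range(rounds):
        total = sum(w)
        mix = [wi / total for wi in w]
        S = best_response(mix)
        strategy.append(S)
        # adversary boosts objectives where our solution scored poorly
        w = [wi * math.exp(-eta * f(S)) for wi, f in zip(w, objectives)]
    return strategy                    # play a uniform random element of it
```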

#18 Dirichlet Multinomial Mixture with Variational Manifold Regularization: Topic Modeling over Short Texts

Authors: Ximing Li ; Jiaojiao Zhang ; Jihong Ouyang

Conventional topic models suffer from a severe sparsity problem when facing extremely short texts such as social media posts. The family of Dirichlet multinomial mixture (DMM) models can handle the sparsity problem; however, they are still very sensitive to ordinary and noisy words, resulting in inaccurate topic representations at the document level. In this paper, we alleviate this problem by preserving the local neighborhood structure of short texts, enabling topical signals to spread among neighboring documents so as to correct inaccurate topic representations. This is achieved by variational manifold regularization, which constrains close short texts to have similar variational topic representations. Building on this idea, we propose a novel Laplacian DMM (LapDMM) topic model. During document graph construction, we further use the word mover's distance with word embeddings to measure document similarities at the semantic level. To evaluate LapDMM, we compare it against state-of-the-art short text topic models on several traditional tasks. Experimental results demonstrate that LapDMM achieves very significant performance gains over the baseline models, e.g., achieving even about 0.2 higher scores on clustering and classification tasks in many cases.

#19 Finding All Bayesian Network Structures within a Factor of Optimal

Authors: Zhenyu A. Liao ; Charupriya Sharma ; James Cussens ; Peter van Beek

A Bayesian network is a widely used probabilistic graphical model with applications in knowledge discovery and prediction. Learning a Bayesian network (BN) from data can be cast as an optimization problem using the well-known score-and-search approach. However, selecting a single model (i.e., the best scoring BN) can be misleading or may not achieve the best possible accuracy. An alternative to committing to a single model is to perform some form of Bayesian or frequentist model averaging, where the space of possible BNs is sampled or enumerated in some fashion. Unfortunately, existing approaches for model averaging either severely restrict the structure of the Bayesian network or have only been shown to scale to networks with fewer than 30 random variables. In this paper, we propose a novel approach to model averaging inspired by performance guarantees in approximation algorithms. Our approach has two primary advantages. First, our approach only considers credible models, in the sense that they are optimal or near-optimal in score. Second, our approach is more efficient and scales to significantly larger Bayesian networks than existing approaches.

#20 Interleave Variational Optimization with Monte Carlo Sampling: A Tale of Two Approximate Inference Paradigms

Authors: Qi Lou ; Rina Dechter ; Alexander Ihler

Computing the partition function of a graphical model is a fundamental task in probabilistic inference. Variational bounds and Monte Carlo methods, two important approximate paradigms for this task, each have their own strengths for solving different types of problems, but it is often nontrivial to decide which one to apply to a particular problem instance without significant prior knowledge and a high level of expertise. In this paper, we propose a general framework that interleaves optimization of variational bounds (via message passing) with Monte Carlo sampling. Our adaptive interleaving policy can automatically balance the computational effort between these two schemes in an instance-dependent way, which provides our framework with the strengths of both schemes, leads to tighter anytime bounds and an unbiased estimate of the partition function, and allows flexible tradeoffs between memory, time, and solution quality. We verify our approach empirically on real-world problems taken from recent UAI inference competitions.

#21 Robust Ordinal Embedding from Contaminated Relative Comparisons

Authors: Ke Ma ; Qianqian Xu ; Xiaochun Cao

Existing ordinal embedding methods usually follow a two-stage routine: outlier detection is first employed to pick out the inconsistent comparisons; then an embedding is learned from the clean data. However, learning in a multi-stage manner is well known to suffer from sub-optimal solutions. In this paper, we propose a unified framework to jointly identify the contaminated comparisons and derive reliable embeddings. The merits of our method are three-fold: (1) By virtue of the proposed unified framework, the sub-optimality of traditional methods is largely alleviated; (2) The proposed method is aware of global inconsistency by minimizing a corresponding cost, while traditional methods only involve local inconsistency; (3) Instead of considering the nuclear norm heuristics, we adopt an exact solution for the rank equality constraint. Our studies are supported by experiments with both simulated examples and real-world data. The proposed framework provides a promising tool for robust ordinal embedding from contaminated comparisons.

#22 On Lifted Inference Using Neural Embeddings

Authors: Mohammad Maminur Islam ; Somdeb Sarkhel ; Deepak Venugopal

We present a dense representation for Markov Logic Networks (MLNs) called Obj2Vec that encodes symmetries in the MLN structure. Identifying symmetries is a key challenge for lifted inference algorithms and we leverage advances in neural networks to learn symmetries which are hard to specify using hand-crafted features. Specifically, we learn an embedding for MLN objects that predicts the context of an object, i.e., objects that appear along with it in formulas of the MLN, since common contexts indicate symmetry in the distribution. Importantly, our formulation leverages well-known skip-gram models that allow us to learn the embedding efficiently. Finally, to reduce the size of the ground MLN, we sample objects based on their learned embeddings. We integrate Obj2Vec with several inference algorithms, and show the scalability and accuracy of our approach compared to other state-of-the-art methods.
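
Because the formulation reuses standard skip-gram training, the embedding step can be sketched with an off-the-shelf word2vec implementation, treating the objects of each grounding as one another's context; the toy corpus below is hypothetical and gensim is used only for illustration.

```python
from gensim.models import Word2Vec

# Hypothetical ground-formula corpus: each "sentence" lists the constants
# that co-occur in one grounding of an MLN formula, so objects appearing
# in similar contexts (a proxy for symmetry) end up with nearby embeddings.
groundings = [
    ["alice", "bob", "cs101"],
    ["alice", "carol", "cs101"],
    ["dave", "erin", "math201"],
    ["dave", "frank", "math201"],
]

model = Word2Vec(groundings, vector_size=16, window=5, min_count=1,
                 sg=1, epochs=200, seed=0)         # sg=1 -> skip-gram
print(model.wv.most_similar("alice", topn=2))      # e.g., bob/carol nearby
```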

#23 Anytime Recursive Best-First Search for Bounding Marginal MAP

Authors: Radu Marinescu ; Akihiro Kishimoto ; Adi Botea ; Rina Dechter ; Alexander Ihler

Marginal MAP is a difficult mixed inference task for graphical models. Existing state-of-the-art solvers for this task are based on a hybrid best-first and depth-first search scheme that allows them to compute upper and lower bounds on the optimal solution value in an anytime fashion. These methods, however, are memory-intensive schemes (via the best-first component) and do not have an efficient memory-management mechanism. For this reason, they are often less effective in practice, especially on difficult problem instances with very large search spaces. In this paper, we introduce a new recursive best-first search based bounding scheme that operates efficiently within limited memory and computes anytime upper and lower bounds that improve over time. An empirical evaluation demonstrates the effectiveness of our proposed approach against current solvers.

#24 Semi-Parametric Sampling for Stochastic Bandits with Many Arms

Authors: Mingdong Ou ; Nan Li ; Cheng Yang ; Shenghuo Zhu ; Rong Jin

We consider the stochastic bandit problem with a large candidate arm set. In this setting, classic multi-armed bandit algorithms, which assume independence among arms and adopt a non-parametric reward model, are inefficient due to the large number of arms. By exploiting arm correlations through a parametric reward model with arm features, contextual bandit algorithms are more efficient, but they can also suffer from large regret in practical applications due to reward estimation bias from a mis-specified model assumption or incomplete features. In this paper, we propose a novel Bayesian framework, called Semi-Parametric Sampling (SPS), for this problem, which employs a semi-parametric function as the reward model. Specifically, the parametric part of SPS, which models the expected reward as a parametric function of arm features, can efficiently eliminate poor arms from the candidate set. The non-parametric part of SPS, which adopts a non-parametric reward model, revises the parametric estimate to avoid estimation bias, especially on the remaining candidate arms. We give an implementation of SPS, Linear SPS (LSPS), which utilizes a linear function as the parametric part. In semi-parametric environments, theoretical analysis shows that LSPS achieves a better regret bound (i.e., Õ(√(N^(1−α) d^α) √T) with α ∈ [0, 1]) than existing approaches. Experiments also demonstrate the superiority of the proposed approach.
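
The parametric part of SPS is in the spirit of linear Thompson sampling; the following minimal sketch shows that ingredient alone (the non-parametric correction is omitted, and the interface is an assumption for illustration, not the paper's exact algorithm).

```python
import numpy as np

def linear_thompson_step(A, b, features, v=1.0, rng=np.random):
    """One round of linear Thompson sampling: sample a parameter from the
    Gaussian posterior and play the arm whose features score highest.

    A: d x d design matrix (starts as the identity), b: d vector,
    features: (num_arms, d) array of arm feature vectors (hypothetical).
    """
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b                               # posterior mean
    theta = rng.multivariate_normal(theta_hat, v**2 * A_inv)
    return int(np.argmax(features @ theta))

def update(A, b, x, reward):
    """Rank-one posterior update after observing `reward` for arm features x."""
    return A + np.outer(x, x), b + reward * x
```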

#25 Memory Bounded Open-Loop Planning in Large POMDPs Using Thompson Sampling

Authors: Thomy Phan ; Lenz Belzner ; Marie Kiermeier ; Markus Friedrich ; Kyrill Schmid ; Claudia Linnhoff-Popien

State-of-the-art approaches to partially observable planning like POMCP are based on stochastic tree search. While these approaches are computationally efficient, they may still construct search trees of considerable size, which could limit performance due to restricted memory resources. In this paper, we propose Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed-size stack of Thompson Sampling bandits. We empirically evaluate POSTS in four large benchmark problems and compare its performance with different tree-based approaches. We show that POSTS achieves competitive performance compared to tree-based open-loop planning and offers a performance-memory tradeoff, making it suitable for partially observable planning with highly restricted computational and memory resources.
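
The core construction, a fixed-size stack of bandits over the planning horizon, can be sketched as follows; the Gaussian value model and the `simulate` hook are assumptions for illustration rather than the paper's exact formulation. Note that memory stays O(horizon × actions), independent of any search-tree size.

```python
import numpy as np

def posts_like_plan(n_actions, horizon, simulate, iters=1000, rng=np.random):
    """Open-loop planning with a stack of Thompson Sampling bandits:
    one bandit per depth chooses the action at that step of the plan.
    Each bandit keeps a Gaussian value estimate per action (mean, count).

    simulate(plan) -> scalar return; assumed environment hook."""
    mean = np.zeros((horizon, n_actions))
    count = np.ones((horizon, n_actions))          # pseudo-count prior
    for _ in range(iters):
        # Thompson step: sample a value per (depth, action), pick argmax
        sample = rng.normal(mean, 1.0 / np.sqrt(count))
        plan = sample.argmax(axis=1)               # open-loop action sequence
        ret = simulate(plan)
        for h, a in enumerate(plan):               # update every level's bandit
            count[h, a] += 1
            mean[h, a] += (ret - mean[h, a]) / count[h, a]
    return mean.argmax(axis=1)                     # greedy final plan
```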